[Remote Store] Sync segments in refresh listener on refresh after commit #10830

Merged
merged 3 commits into from
Oct 23, 2023

@ashking94 (Member) commented Oct 22, 2023

Description

This PR makes the following changes:

  1. On replica promotion, the segments tracker is updated with the files already uploaded to the remote store when remoteDirectory.init() is invoked in runAfterRefreshExactlyOnce(..). This prevents spikes in refresh lag bytes on failover.
  2. During failover, runAfterRefreshExactlyOnce can execute while syncSegments does not. This happens because the check this.primaryTerm != indexShard.getOperationPrimaryTerm() evaluates to true the first time but to false the second time. This PR handles that case.
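
To illustrate the second point, here is a minimal, self-contained sketch of how a primary term check that is evaluated twice can run the init step but skip the sync. It is not the code from this PR: FakeShard, DoubleCheckListener, SingleCheckListener, afterRefresh and the events list are hypothetical names made up for the example, and the single-evaluation variant is only one possible way to avoid the skip, not necessarily the approach taken here.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: every class below is a hypothetical stand-in, not OpenSearch code.
class PrimaryTermCheckSketch {

    /** Stand-in for the shard; it only exposes the operation primary term. */
    static final class FakeShard {
        private long operationPrimaryTerm;

        FakeShard(long term) {
            this.operationPrimaryTerm = term;
        }

        long getOperationPrimaryTerm() {
            return operationPrimaryTerm;
        }

        /** Simulates a replica promotion by bumping the primary term. */
        void promote() {
            operationPrimaryTerm++;
        }
    }

    /** Evaluates the term check twice: once for init, once more for the sync decision. */
    static final class DoubleCheckListener {
        final List<String> events = new ArrayList<>();
        private final FakeShard shard;
        private long primaryTerm;

        DoubleCheckListener(FakeShard shard) {
            this.shard = shard;
            this.primaryTerm = shard.getOperationPrimaryTerm();
        }

        void afterRefresh(boolean didRefresh) {
            // First evaluation: true after promotion, and it updates the stored term.
            if (primaryTerm != shard.getOperationPrimaryTerm()) {
                primaryTerm = shard.getOperationPrimaryTerm();
                events.add("init");
            }
            // Second evaluation: the stored term now matches, so this is false.
            if (didRefresh || primaryTerm != shard.getOperationPrimaryTerm()) {
                events.add("syncSegments");
            }
        }
    }

    /** Evaluates the term check once and reuses the result for both decisions. */
    static final class SingleCheckListener {
        final List<String> events = new ArrayList<>();
        private final FakeShard shard;
        private long primaryTerm;

        SingleCheckListener(FakeShard shard) {
            this.shard = shard;
            this.primaryTerm = shard.getOperationPrimaryTerm();
        }

        void afterRefresh(boolean didRefresh) {
            boolean termChanged = primaryTerm != shard.getOperationPrimaryTerm();
            if (termChanged) {
                primaryTerm = shard.getOperationPrimaryTerm();
                events.add("init");
            }
            if (didRefresh || termChanged) {
                events.add("syncSegments");
            }
        }
    }

    public static void main(String[] args) {
        FakeShard shard = new FakeShard(1L);
        DoubleCheckListener doubleCheck = new DoubleCheckListener(shard);
        SingleCheckListener singleCheck = new SingleCheckListener(shard);

        shard.promote();                 // failover: the term changes
        doubleCheck.afterRefresh(false); // a refresh call where nothing new was refreshed
        singleCheck.afterRefresh(false);

        System.out.println("double check: " + doubleCheck.events); // [init]
        System.out.println("single check: " + singleCheck.events); // [init, syncSegments]
    }
}
```

In the sketch, the double-check listener records only the init step after a promotion, which mirrors the case described above where syncSegments is skipped.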

Related Issues

Resolves #10821, #10831

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions bot added the enhancement, Storage:Durability, Storage:Remote, and v2.12.0 labels on Oct 22, 2023
@github-actions bot (Contributor) commented Oct 22, 2023

Compatibility status:

Checks if related components are compatible with change 17b9dcd

Incompatible components

Incompatible components: [https://github.com/opensearch-project/cross-cluster-replication.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/reporting.git]

@codecov bot commented Oct 22, 2023

Codecov Report

Merging #10830 (17b9dcd) into main (51626d0) will decrease coverage by 0.03%.
Report is 9 commits behind head on main.
The diff coverage is 90.04%.

@@             Coverage Diff              @@
##               main   #10830      +/-   ##
============================================
- Coverage     71.31%   71.29%   -0.03%     
- Complexity    58671    58728      +57     
============================================
  Files          4860     4869       +9     
  Lines        276335   276490     +155     
  Branches      40198    40204       +6     
============================================
+ Hits         197068   197123      +55     
- Misses        62803    62886      +83     
- Partials      16464    16481      +17     
Files Coverage Δ
...upport/replication/TransportReplicationAction.java 77.10% <100.00%> (-3.58%) ⬇️
...ava/org/opensearch/cluster/node/DiscoveryNode.java 91.62% <100.00%> (+0.17%) ⬆️
...a/org/opensearch/common/network/NetworkModule.java 92.20% <100.00%> (+0.20%) ⬆️
...rg/opensearch/common/settings/ClusterSettings.java 92.85% <ø> (ø)
.../java/org/opensearch/gateway/GatewayMetaState.java 69.56% <100.00%> (+1.04%) ⬆️
...earch/index/remote/RemoteStorePressureService.java 96.61% <ø> (-3.39%) ⬇️
...ch/index/remote/RemoteTranslogTransferTracker.java 80.97% <100.00%> (+2.06%) ⬆️
...in/java/org/opensearch/index/shard/IndexShard.java 70.56% <100.00%> (+0.81%) ⬆️
...rg/opensearch/index/translog/RemoteFsTranslog.java 75.12% <100.00%> (+0.62%) ⬆️
server/src/main/java/org/opensearch/node/Node.java 85.31% <100.00%> (+0.09%) ⬆️
... and 17 more

... and 457 files with indirect coverage changes

@ashking94 (Member, Author) commented

Gradle Check (Jenkins) Run Completed with:

Flaky test - #2775

@github-actions bot (Contributor) commented

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.http.SearchRestCancellationIT.testAutomaticCancellationMultiSearchDuringFetchPhase

@github-actions bot (Contributor) commented

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.search.SearchWeightedRoutingIT.testMultiGetWithNetworkDisruption_FailOpenEnabled

@ashking94 (Member, Author) commented

Gradle Check (Jenkins) Run Completed with:

This is a build failure before the rebase of BWC PR that was causing this issue.

@ashking94 (Member, Author) commented

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.search.SearchWeightedRoutingIT.testMultiGetWithNetworkDisruption_FailOpenEnabled

Flaky test - #10755.

@sachinpkale sachinpkale merged commit 7453daa into opensearch-project:main Oct 23, 2023
16 checks passed
@sachinpkale sachinpkale added the backport 2.x Backport to 2.x branch label Oct 23, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Oct 23, 2023
…mit (#10830)

* [Remote Store] Sync segments in refresh listener on refresh after commit

Signed-off-by: Ashish Singh <[email protected]>

* Add Integration Tests

Signed-off-by: Ashish Singh <[email protected]>

* Add comments and java doc

Signed-off-by: Ashish Singh <[email protected]>

---------

Signed-off-by: Ashish Singh <[email protected]>
(cherry picked from commit 7453daa)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
sachinpkale pushed a commit that referenced this pull request Oct 23, 2023
…mit (#10830) (#10849)

* [Remote Store] Sync segments in refresh listener on refresh after commit

* Add Integration Tests

* Add comments and java doc

---------

(cherry picked from commit 7453daa)

Signed-off-by: Ashish Singh <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…mit (opensearch-project#10830)

* [Remote Store] Sync segments in refresh listener on refresh after commit

Signed-off-by: Ashish Singh <[email protected]>

* Add Integration Tests

Signed-off-by: Ashish Singh <[email protected]>

* Add comments and java doc

Signed-off-by: Ashish Singh <[email protected]>

---------

Signed-off-by: Ashish Singh <[email protected]>
Signed-off-by: Shivansh Arora <[email protected]>
Labels: backport 2.x, enhancement, skip-changelog, Storage:Durability, Storage:Remote, v2.12.0
Linked issues: [Remote Store] Sync segments in refresh listener on refresh after commit
3 participants